Attorney Docket No. 0670-7064 



IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 



In re Patent Application of: 

Yasushi SATO 

Serial No.: 10/559,571 

Filed: December 5, 2005 

For: SPEECH SYNTHESIS FOR SYNTHESIZING MISSING PARTS

Confirmation No.: 8897
Examiner: Martin Lemer
Group Art Unit: 2626



AMENDMENT 



Honorable Commissioner of Patents 
P.O. Box 1450 
Alexandria, VA 22313-1450 

Dear Sir: 

In response to the Official Action mailed October 9, 2009, and the Advisory 
Action mailed February 18, 2010, please consider the following amendments and 
remarks in connection with the above-identified application. 

Amendments to the Claims are reflected in the listing of claims, which begins 
on page 2 of this paper. 

Remarks begin on page 8 of this paper. 

The listing of claims will replace all prior versions, and listings, of claims in the 
application: 

Listing of Claims: 

1.-41. (Canceled) 

42. (Previously Presented) A speech synthesis device, comprising: 

voice unit storage means for storing a plurality of pieces of voice unit data 
representing voice units; 

phoneme storage means for storing a plurality of pieces of phoneme data each of 
which is a phoneme or comprises phoneme fragments composing a phoneme; 

cadence prediction means for inputting sentence information representing a 
sentence to predict the cadence of voice units composing the sentence; 

selecting means for selecting voice unit data satisfying predetermined conditions 
out of the plurality of pieces of voice unit data stored in the voice unit storage means, 
wherein the predetermined conditions are that the voice unit data to be selected 
matches in its reading with the voice unit composing the sentence and has a correlation 
greater than a predetermined amount with a cadence prediction result by the cadence 
prediction means; 

missing part cadence prediction means for predicting the cadence of voice units 
which have been decided not to satisfy the predetermined conditions by the selection 
means; 

missing part synthesis means for specifying phonemes contained in the voice 
unit decided not to satisfy the predetermined condition by the selection means out of the 
voice units composing the sentence, for acquiring phoneme data representing the 
specified phoneme or phoneme fragments composing the specified phoneme from the 
phoneme storage means, for converting the acquired phoneme data so that the 
phoneme or phoneme fragments represented by the acquired phoneme data matches 
with a cadence prediction result by the missing part cadence prediction means, and for 
interconnecting the converted data, thereby synthesizing speech data representing a 
waveform of the voice unit; and 

creation means for interconnecting the voice unit data selected by the selection 
means and the speech data synthesized by the missing part synthesis means, thereby 
creating data representing synthesis speech. 

43. (Previously Presented) The speech synthesis device according to claim 42, 
wherein the selection means selects the voice unit data out of the plurality of pieces of 
voice unit data stored in the voice unit storage means under the predetermined 
conditions further including that the presence or absence of nasalization or 
devocalization of the voice unit data matches with the cadence prediction result by the 
cadence prediction means. 

44. (Previously Presented) The speech synthesis device according to claim 43, 
wherein the device further comprises utterance speed conversion means for acquiring 
utterance speed data specifying conditions of a speed for producing the synthesis 
speech created by the creation means and for converting the voice unit data and/or 
speech data so as to represent a speech to be produced at a speed satisfying the 
conditions specified by the utterance speed data. 

45. (Previously Presented) The speech synthesis device according to claim 44, 
wherein the utterance speed conversion means operates to convert the voice unit data 
and/or the speech data so as to represent a speech to be produced at a speed 
satisfying the conditions specified by the utterance speed data, by 
eliminating a segment representing a phoneme fragment from voice unit data and/or 
speech data composing data representing the synthesis speech or by adding a segment 
representing a phoneme fragment to the voice unit data and/or speech data. 

46. (Currently Amended) The speech synthesis device according to [[any of 
claims]] claim 42, wherein the voice unit storage means operates to associate phonetic 
data representing a reading of voice unit with the voice unit data, and the selection 
means operates to handle voice unit data which is associated with phonetic data 
representing a reading matching with the reading of the voice unit composing the 
sentence, as voice unit whose reading is common with the voice unit. 

47. (Currently Amended) The speech synthesis device according to [[any of 
claims]] claim 43, wherein the voice unit storage means operates to associate phonetic 
data representing a reading of voice unit with the voice unit data, and the selection 
means operates to handle voice unit data which is associated with phonetic data 
representing a reading matching with the reading of the voice unit composing the 
sentence, as voice unit whose reading is common with the voice unit. 

48. (Currently Amended) The speech synthesis device according to [[any of 
claims]] claim 44, wherein the voice unit storage means operates to associate phonetic 
data representing a reading of voice unit with the voice unit data, and the selection 
means operates to handle voice unit data which is associated with phonetic data 
representing a reading matching with the reading of the voice unit composing the 
sentence, as voice unit whose reading is common with the voice unit. 

49. (Currently Amended) The speech synthesis device according to [[any of 
claims]] claim 45, wherein the voice unit storage means operates to associate phonetic 
data representing a reading of voice unit with the voice unit data, and the selection 
means operates to handle voice unit data which is associated with phonetic data 
representing a reading matching with the reading of the voice unit composing the 
sentence, as voice unit whose reading is common with the voice unit. 

50. (Previously Presented) A speech synthesis method performed by a speech 
synthesis device having storage means and processing means, the method comprising 
the steps of: 

storing in the storage means a plurality of pieces of voice unit data representing 
voice units; 

storing in the storage means a plurality of pieces of phoneme data each of which 
is a phoneme or comprises phoneme fragments composing a phoneme; 

inputting in the processing means sentence information representing a sentence 
to predict the cadence of voice units composing the sentence; 

selecting, in the processing means, voice units satisfying predetermined 
conditions out of the plurality of pieces of voice unit data stored in the storage means, 
wherein the predetermined conditions are that the voice unit data to be selected 
matches in its reading with the voice unit composing the sentence and has a correlation 
greater than a predetermined amount with a cadence prediction result; 

predicting in the processing means the cadence of voice units which have been 
decided not to satisfy the predetermined conditions; 

in the processing means, specifying phonemes contained in the voice unit 
decided not to satisfy the predetermined conditions out of the voice units composing the 
sentence, acquiring phoneme data representing the specified phoneme or phoneme 
fragments composing the specified phoneme from the storage means, converting the 
acquired phoneme data so that the phoneme or phoneme fragments represented by the 
acquired phoneme data matches with a cadence prediction result, and interconnecting 
the converted data, thereby synthesizing speech data representing a waveform of the 
voice unit; and 

in the processing means, interconnecting the selected voice unit data and the 
synthesized speech data, thereby creating data representing synthesis speech. 

51. (Previously Presented) The speech synthesis method according to claim 50, 
wherein the processing means operates to select the voice unit data out of the plurality 
of pieces of voice unit data stored in the storage means under the predetermined 
conditions further including that the presence or absence of nasalization or 
devocalization of the voice unit data matches with the cadence prediction result. 

52. (Previously Presented) A computer readable medium which records a 
computer program causing a computer to operate as: 

voice unit storage means for storing a plurality of pieces of voice unit data 
representing voice units; 

phoneme storage means for storing a plurality of pieces of phoneme data each of 
which is a phoneme or comprises phoneme fragments composing a phoneme; 

cadence prediction means for inputting sentence information representing a 
sentence to predict the cadence of voice units comprising the sentence; 

selecting means for selecting voice unit data satisfying predetermined conditions 
out of the plurality of pieces of voice unit data stored in the voice unit storage means, 
wherein the predetermined conditions are that the voice unit data to be selected 
matches in its reading with the voice unit composing the sentence and has a correlation 
greater than a predetermined amount with a cadence prediction result by the cadence 
prediction means; 

missing part cadence prediction means for predicting the cadence of voice units 
which have been decided not to satisfy the predetermined conditions by the selection 
means; 

missing part synthesis means for specifying phonemes contained in the voice 
unit decided not to satisfy the predetermined condition by the selection means out of the 
voice units composing the sentence, for acquiring phoneme data representing the 
specified phoneme or phoneme fragments composing the specified phoneme from the 
phoneme storage means, for converting the acquired phoneme data so that the 
phoneme or phoneme fragments represented by the acquired phoneme data matches 
with a cadence prediction result by the missing part cadence prediction means, and for 
interconnecting the converted data, thereby synthesizing speech data representing a 
waveform of the voice unit; and 

creation means for interconnecting the voice unit data selected by the selection 
means and the speech data synthesized by the missing part synthesis means, thereby 
creating data representing synthesis speech. 

53. (Previously Presented) The computer readable medium according to claim 
52, wherein the selection means selects the voice unit data out of the plurality of pieces 
of voice unit data stored in the voice unit storage means under the predetermined 
conditions further including that the presence or absence of nasalization or 
devocalization of the voice unit data matches with the cadence prediction result by the 
cadence prediction means. 

REMARKS 

The Official Action mailed October 9, 2009, and the Advisory Action mailed 
February 18, 2010, have been received and their contents carefully noted. This 
response supplements the After Final Amendment filed February 9, 2010. Filed 
concurrently herewith is a Request for Two Month Extension of Time, which extends the 
shortened statutory period for response to March 9, 2010. Also, filed concurrently 
herewith is a Request for Continued Examination. Accordingly, the Applicant 
respectfully submits that this response is being timely filed. 

At this opportunity, the Applicant has amended claims 46-49 to correct a minor 
informality. Specifically, claims 46-49 previously included the phrase "any of claims" 
(generally used with multiple dependent claims); however, claims 46-49 are dependent 
on a single claim. Therefore, the phrase "any of claims" is not necessary. By the 
present Amendment, "any of claims" has been changed to "claim." 

For the reasons set forth in the Amendment filed February 9, 2010, all claims are 
believed to be in condition for allowance. 

Should the Examiner believe that anything further would be desirable to place 
this application in better condition for allowance, the Examiner is invited to contact the 
undersigned at the telephone number listed below. 

The Commissioner is hereby authorized to charge fees under 37 C.F.R. §§ 1.16, 
1.17, 1.20(a), 1.20(b), 1.20(c), and 1.20(d) (except the Issue Fee) which may be 
required now or hereafter, or credit any overpayment to Deposit Account No. 50-2280. 



Respectfully submitted, 




Eric J. Robinson 
Reg. No. 38,285 



Robinson Intellectual Property Law Office, P.C. 
PMB 955 
21010 Southbank Street 
Potomac Falls, Virginia 20165 
(571)434-6789 



